In information theory, the asymptotic equipartition property (AEP) is a general property of the output samples of a stochastic source. It is fundamental to the concept of the typical set used in the theory of data compression.
Roughly speaking, the theorem states that although there are many series of results that may be produced by a random process, the one actually produced is most probably from a loosely-defined set of outcomes that all have approximately the same chance of being the one actually realized. (This is a consequence of the law of large numbers and ergodic theory.) Although there are individual outcomes which have a higher probability than any outcome in this set, the vast number of outcomes in the set almost guarantees that the outcome will come from the set.
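As a concrete illustration of this claim, consider 100 tosses of a biased coin with heads probability 0.6 (a made-up example, not from the text): the single most likely sequence is all heads, yet almost all of the probability mass sits on sequences whose fraction of heads is close to 0.6. A short numerical sketch:

```python
import math

p, n = 0.6, 100  # biased coin, 100 tosses (illustrative values)

# Probability of the single most likely sequence: all heads.
p_all_heads = p ** n  # roughly 6.5e-23

# Total probability of the "typical" sequences: those whose number of
# heads lies within 10 of the mean n*p = 60.
mass = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
           for k in range(50, 71))

print(p_all_heads)  # vanishingly small
print(mass)         # the bulk of the probability mass
```

Although each individual typical sequence is less probable than the all-heads sequence, together the typical sequences carry nearly all of the probability, which is the sense in which the realized outcome almost certainly comes from the typical set.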
In the field of pseudorandom number generation, a candidate generator of undetermined quality whose output sequence lies too far outside the typical set by some statistical criterion is rejected as insufficiently random. Thus, although the typical set is loosely defined, practical notions arise concerning sufficient typicality.
Given a discrete-time stationary ergodic stochastic process $X$ on the probability space $(\Omega, B, p)$, the AEP is an assertion that

$$-\frac{1}{n}\log p(X_1, X_2, \ldots, X_n) \to H(X) \quad \text{as} \quad n \to \infty,$$

where $(X_1, X_2, \ldots, X_n)$ denotes the process limited to duration $n$, and $H(X)$ or simply $H$ denotes the entropy rate of $X$, which must exist for all discrete-time stationary processes, including the ergodic ones. The AEP is proved for finite-valued (i.e. $|\Omega| < \infty$) stationary ergodic stochastic processes in the Shannon–McMillan–Breiman theorem using ergodic theory, and for any i.i.d. source directly using the law of large numbers, in both the discrete-valued case (where $H$ is simply the entropy of a symbol) and the continuous-valued case (where $H$ is the differential entropy instead). The definition of the AEP can also be extended to certain classes of continuous-time stochastic processes for which a typical set exists for long enough observation times. The convergence is almost sure in all cases.
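The assertion can be checked numerically. The sketch below (the two-state transition matrix is a hypothetical example, not taken from the text) samples a long path of a stationary ergodic Markov chain and compares $-\frac{1}{n}\log p(X_1,\ldots,X_n)$ with the entropy rate computed in closed form:

```python
import math
import random

random.seed(0)

# Two-state stationary Markov chain (hypothetical example).
P = [[0.9, 0.1], [0.2, 0.8]]  # transition probabilities
pi = [2 / 3, 1 / 3]           # stationary distribution: pi P = pi

def h(row):
    """Entropy (in nats) of one transition distribution."""
    return -sum(q * math.log(q) for q in row)

# Entropy rate of a stationary Markov chain: sum_i pi_i H(P_i).
entropy_rate = sum(pi[i] * h(P[i]) for i in range(2))

# Sample a long path started from the stationary distribution.
n = 200_000
x = random.choices([0, 1], weights=pi)[0]
logp = math.log(pi[x])
for _ in range(n - 1):
    nxt = random.choices([0, 1], weights=P[x])[0]
    logp += math.log(P[x][nxt])
    x = nxt

empirical = -logp / n
print(entropy_rate, empirical)  # the two agree closely for large n
```

The empirical value $-\frac{1}{n}\log p(X_1,\ldots,X_n)$ of a single long sample path concentrates around the entropy rate, as the Shannon–McMillan–Breiman theorem predicts.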
Given that $X$ is an i.i.d. source, its time series $X_1, \ldots, X_n$ is i.i.d. with entropy $H(X)$ in the discrete-valued case and differential entropy in the continuous-valued case. The weak law of large numbers gives the AEP with convergence in probability,

$$\lim_{n\to\infty}\Pr\left[\left|-\frac{1}{n}\log p(X_1, X_2, \ldots, X_n) - H(X)\right| > \epsilon\right] = 0 \qquad \forall \epsilon > 0,$$

since the entropy is equal to the expectation of $-\frac{1}{n}\log p(X_1, X_2, \ldots, X_n)$. The strong law of large numbers asserts the stronger almost sure convergence,

$$\Pr\left[\lim_{n\to\infty} -\frac{1}{n}\log p(X_1, X_2, \ldots, X_n) = H(X)\right] = 1,$$
which implies the convergence-in-probability result obtained from the weak law of large numbers.
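Because independence makes $\log p(X_1,\ldots,X_n)$ split into a sum of i.i.d. terms, the convergence above is just the law of large numbers applied to $-\log p(X_i)$. A minimal simulation, assuming a hypothetical three-symbol source:

```python
import math
import random

random.seed(1)

# Hypothetical i.i.d. source over three symbols.
probs = {"a": 0.5, "b": 0.3, "c": 0.2}
H = -sum(p * math.log2(p) for p in probs.values())  # entropy in bits

symbols, weights = zip(*probs.items())
for n in (100, 10_000, 1_000_000):
    sample = random.choices(symbols, weights=weights, k=n)
    # log2 p(X1,...,Xn) = sum of per-symbol log-probabilities (independence)
    avg = -sum(math.log2(probs[s]) for s in sample) / n
    print(n, round(avg, 4))  # approaches H as n grows
```

Each row prints the per-symbol value $-\frac{1}{n}\log_2 p(X_1,\ldots,X_n)$, which tightens around $H(X)$ as $n$ increases, exactly as the weak law predicts.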
Consider a finite-valued sample space $\Omega$, i.e. $|\Omega| < \infty$, for the discrete-time stationary ergodic process $X := \{X_n\}$ defined on the probability space $(\Omega, B, p)$. The AEP for such a stochastic source, known as the Shannon–McMillan–Breiman theorem, can be shown using the sandwich proof of Algoet and Cover.
The assumptions of stationarity, ergodicity, and identical distribution of the random variables are not essential for the AEP to hold. Indeed, as is quite clear intuitively, the AEP requires only that some form of the law of large numbers holds, which is fairly general. However, the expression $-\frac{1}{n}\log p(X_1, X_2, \ldots, X_n)$ needs to be suitably generalized, and the conditions need to be formulated precisely.
We assume that the source is producing independent symbols, with possibly different output statistics at each instant. We assume that the statistics of the process are known completely, that is, the marginal distribution of the process seen at each time instant is known. The joint distribution is just the product of the marginals. Then, under the condition (which can be relaxed) that $\operatorname{Var}[\log p(X_i)] < M$ for all $i$, for some $M > 0$, the following holds (AEP):

$$\lim_{n\to\infty}\Pr\left[\left|-\frac{1}{n}\log p(X_1, X_2, \ldots, X_n) - \overline{H}_n(X)\right| < \epsilon\right] = 1 \qquad \forall \epsilon > 0,$$

where

$$\overline{H}_n(X) = \frac{1}{n} H(X_1, X_2, \ldots, X_n).$$
The proof follows from a simple application of Markov's inequality (applied to the second moment of $\log p(X_i)$).
It is obvious that the proof holds if any moment $\operatorname{E}\left[|\log p(X_i)|^r\right]$ is uniformly bounded for some $r > 1$ (again by Markov's inequality applied to the $r$-th moment).
Even this condition is not necessary, but given a non-stationary random process, it should not be difficult to test whether the AEP holds using the above method.
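As a sketch of such a test, the following simulates a hypothetical non-stationary source of independent Bernoulli symbols whose bias drifts over time but stays bounded away from 0 and 1 (so $\operatorname{Var}[\log p(X_i)]$ is uniformly bounded, satisfying the condition above), and compares the empirical $-\frac{1}{n}\log p(X_1,\ldots,X_n)$ with $\overline{H}_n(X)$:

```python
import math
import random

random.seed(2)

# Hypothetical non-stationary source: independent Bernoulli symbols whose
# bias cycles through [0.2, 0.788]; the statistics are known at every instant.
n = 100_000
bias = [0.2 + 0.6 * (i % 50) / 50 for i in range(n)]

def h2(q):
    """Binary entropy in nats."""
    return -(q * math.log(q) + (1 - q) * math.log(1 - q))

# For independent symbols, H(X1,...,Xn) is the sum of marginal entropies.
H_bar = sum(h2(q) for q in bias) / n

logp = 0.0
for q in bias:
    x = random.random() < q          # draw one Bernoulli(q) symbol
    logp += math.log(q if x else 1 - q)

empirical = -logp / n
print(H_bar, empirical)  # close, as the generalized AEP predicts
```

The per-symbol log-likelihood of the realized sequence concentrates around the time-averaged entropy $\overline{H}_n(X)$ even though no single marginal distribution is repeated.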
The AEP for non-stationary discrete-time independent processes leads us to (among other results) the source coding theorem for non-stationary sources (with independent output symbols) and the channel coding theorem for non-stationary memoryless channels.
The source coding theorem for discrete-time non-stationary independent sources can be found here: source coding theorem.
The channel coding theorem for discrete-time non-stationary memoryless channels can be found here: noisy channel coding theorem.
Discrete-time functions can be interpolated to continuous-time functions. If such an interpolation $f$ is measurable, we may define the continuous-time stationary process accordingly as $\tilde{X} := f \circ X$. If the AEP holds for the discrete-time process, as in the i.i.d. or finite-valued stationary ergodic cases shown above, it automatically holds for the continuous-time stationary process derived from it by some measurable interpolation, i.e.

$$-\frac{1}{n}\log p\left(\tilde{X}_0^\tau\right) \to H(X),$$

where $n$ corresponds to the degrees of freedom in a time interval $\tau$. Here $nH(X)/\tau$ and $H(X)$ are the entropy per unit time and the entropy per degree of freedom, respectively, as defined by Shannon.
An important class of such continuous-time stationary processes is the bandlimited stationary ergodic process, with the sample space being a subset of the continuous functions. The AEP holds if the process is white, in which case the time samples are i.i.d., or if there exists $T > 1/2W$, where $W$ is the nominal bandwidth, such that the $T$-spaced time samples take values in a finite set, in which case we have a discrete-time finite-valued stationary ergodic process.
Any time-invariant operation also preserves the AEP, stationarity, and ergodicity, and we may easily turn a stationary process into a non-stationary one without losing the AEP by nulling out a finite number of time samples in the process.